72 research outputs found

    Longitudinal detection of radiological abnormalities with time-modulated LSTM

    Full text link
    Convolutional neural networks (CNNs) have been successfully employed in recent years for the detection of radiological abnormalities in medical images such as plain x-rays. To date, most studies use CNNs on individual examinations in isolation and discard previously available clinical information. In this study we set out to explore whether Long Short-Term Memory networks (LSTMs) can be used to improve classification performance when modelling the entire sequence of radiographs that may be available for a given patient, including their reports. A limitation of traditional LSTMs, though, is that they implicitly assume equally-spaced observations, whereas radiological exams are event-based and therefore irregularly sampled. Using both a simulated dataset and a large-scale chest x-ray dataset, we demonstrate that a simple modification of the LSTM architecture, which explicitly takes into account the time lag between consecutive observations, can boost classification performance. Our empirical results demonstrate improved detection of commonly reported abnormalities on chest x-rays such as cardiomegaly, consolidation, pleural effusion and hiatus hernia. Comment: Submitted to 4th MICCAI Workshop on Deep Learning in Medical Imaging Analysis
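    The abstract does not reproduce the exact architectural change, but a minimal PyTorch sketch of the general idea, discounting the carried-over cell state by a learned function of the time lag between exams (in the spirit of time-aware LSTM variants), might look as follows; all class and variable names here are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class TimeModulatedLSTMCell(nn.Module):
    """LSTM cell whose carried-over memory is discounted by the elapsed
    time between consecutive observations (illustrative sketch only)."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        self.time_decay = nn.Linear(1, hidden_size)  # learned decay rate

    def forward(self, x, h_prev, c_prev, delta_t):
        # delta_t: (batch, 1) time lag since the previous exam.
        decay = torch.exp(-torch.relu(self.time_decay(delta_t)))
        c_prev = c_prev * decay  # older evidence contributes less
        i, f, g, o = self.gates(torch.cat([x, h_prev], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

cell = TimeModulatedLSTMCell(input_size=16, hidden_size=32)
h = c = torch.zeros(4, 32)
h, c = cell(torch.randn(4, 16), h, c, torch.rand(4, 1))
```

    Because the decay factor lies in (0, 1] and shrinks as delta_t grows, a radiograph taken years earlier influences the current state less than one taken days earlier.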

    Comparison of System Call Representations for Intrusion Detection

    Full text link
    Over the years, artificial neural networks have been applied successfully in many areas including IT security. Yet, neural networks can only process continuous input data. This is particularly challenging for security-related non-continuous data like system calls. This work focuses on four different options to preprocess sequences of system calls so that they can be processed by neural networks. These input options are based on one-hot encoding and on learning word2vec or GloVe representations of system calls. As an additional option, we analyze whether the mapping of system calls to their respective kernel modules is an adequate generalization step for (a) replacing system calls or (b) enhancing system call data with additional information regarding their context. However, when performing such preprocessing steps it is important to ensure that no relevant information is lost in the process. The overall objective of system call based intrusion detection is to categorize sequences of system calls as benign or malicious behavior. Therefore, this scenario is used to evaluate the different input options as a classification task. The results show that each of the four methods is a valid option when preprocessing input data, but the use of kernel modules alone is not recommended because too much information is lost during the mapping process. Comment: 12 pages, 1 figure, submitted to CISIS 201
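    As a hedged sketch of two of the preprocessing options, the snippet below builds one-hot vectors with NumPy and trains word2vec embeddings with gensim (a GloVe variant would be analogous); the toy system-call traces are invented for illustration.

```python
import numpy as np
from gensim.models import Word2Vec

# Toy traces; real data would be per-process sequences of system calls.
traces = [["open", "read", "write", "close"],
          ["open", "mmap", "read", "close"]]

# Option 1: one-hot encoding over the system-call vocabulary.
vocab = sorted({call for trace in traces for call in trace})
index = {call: i for i, call in enumerate(vocab)}

def one_hot(trace):
    out = np.zeros((len(trace), len(vocab)), dtype=np.float32)
    out[np.arange(len(trace)), [index[call] for call in trace]] = 1.0
    return out

# Option 2: learned word2vec embeddings of system calls (gensim >= 4).
w2v = Word2Vec(sentences=traces, vector_size=8, window=2, min_count=1)
read_vector = w2v.wv["read"]  # dense representation of the call "read"
```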

    On the performance of deep learning models for time series classification in streaming

    Get PDF
    Processing data streams arriving at high speed requires the development of models that can provide fast and accurate predictions. Although deep neural networks are the state-of-the-art for many machine learning tasks, their performance in real-time data streaming scenarios is a research area that has not yet been fully addressed. Nevertheless, there have been recent efforts to adapt complex deep learning models for streaming tasks by reducing their processing rate. The design of the asynchronous dual-pipeline deep learning framework makes it possible to predict on incoming instances and update the model simultaneously, using two separate layers. The aim of this work is to assess the performance of different types of deep architectures for data streaming classification using this framework. We evaluate models such as multi-layer perceptrons, recurrent, convolutional and temporal convolutional neural networks over several time-series datasets that are simulated as streams. The obtained results indicate that convolutional architectures achieve higher performance in terms of accuracy and efficiency. Comment: Paper submitted to the 15th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2020)
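    A minimal sketch of the dual-pipeline idea, with Python threads standing in for the framework's two concurrent pipelines and a stub classifier in place of a real incremental model (all names are assumed for illustration, not taken from the framework):

```python
import queue
import threading

class StubModel:
    """Placeholder for an incrementally trainable classifier."""
    def predict(self, x):
        return 0
    def partial_fit(self, x, y):
        pass

incoming = queue.Queue()  # (instance, label) pairs arriving from the stream
to_train = queue.Queue()  # instances already predicted, awaiting training

def inference_pipeline(model):
    # Fast path: classify each instance as soon as it arrives.
    while True:
        x, y = incoming.get()
        _ = model.predict(x)
        to_train.put((x, y))  # hand off for the test-then-train update

def training_pipeline(model):
    # Slow path: update the model concurrently, never blocking predictions.
    while True:
        x, y = to_train.get()
        model.partial_fit(x, y)

model = StubModel()
threading.Thread(target=inference_pipeline, args=(model,), daemon=True).start()
threading.Thread(target=training_pipeline, args=(model,), daemon=True).start()
```

    Decoupling the two loops keeps prediction latency bounded by the fast path even when a single model update is slow.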

    Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks

    Full text link
    Zazo R, Lozano-Diez A, Gonzalez-Dominguez J, T. Toledano D, Gonzalez-Rodriguez J (2016) Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks. PLoS ONE 11(1): e0146917. doi:10.1371/journal.pone.0146917
    Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vectors and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (~3 s). In this contribution we present an open-source, end-to-end LSTM RNN system running on limited computational resources (a single GPU) that outperforms a reference i-vector system on a subset of the NIST Language Recognition Evaluation (8 target languages, 3 s task) by up to 26%. This result is in line with previously published research using proprietary LSTM implementations and huge computational resources, which made those results hard to reproduce. Further, we extend those previous experiments by modeling unseen languages (out-of-set, OOS, modeling), which is crucial in real applications. Results show that an LSTM RNN with OOS modeling is able to detect these languages and generalizes robustly to unseen OOS languages. Finally, we also analyze the effect of even more limited test data (from 2.25 s down to 0.1 s), showing that with as little as 0.5 s an accuracy of over 50% can be achieved. This work has been supported by project CMC-V2: Caracterizacion, Modelado y Compensacion de Variabilidad en la Señal de Voz (TEC2012-37585-C02-01), funded by Ministerio de Economia y Competitividad, Spain.
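    As a rough Keras sketch of such an end-to-end system (frame-level acoustic features in, language posterior out, with one extra output unit for out-of-set languages), note that the feature dimensionality and layer sizes below are assumptions, not the paper's configuration:

```python
import tensorflow as tf

NUM_FEATURES = 20   # acoustic features per frame (assumed, e.g. MFCC-like)
NUM_LANGUAGES = 8   # target languages in the evaluation subset

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, NUM_FEATURES)),  # variable-length utterance
    tf.keras.layers.LSTM(512),
    # NUM_LANGUAGES + 1 units: the extra class absorbs out-of-set (OOS) input.
    tf.keras.layers.Dense(NUM_LANGUAGES + 1, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```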

    Comparing Hidden Markov Models and Long Short Term Memory Neural Networks for Learning Action Representations

    Get PDF
    Panzner M, Cimiano P. Comparing Hidden Markov Models and Long Short Term Memory Neural Networks for Learning Action Representations. In: Pardalos PM, Conca P, Giuffrida G, Nicosia G, eds. Machine Learning, Optimization, and Big Data : Second International Workshop, MOD 2016, Volterra, Italy, August 26-29, 2016. Revised Selected Papers. Lecture Notes in Computer Science. Vol 10122. Cham: Springer International Publishing; 2016: 94-105

    Star-forming cores embedded in a massive cold clump: Fragmentation, collapse and energetic outflows

    Full text link
    The fate of massive cold clumps, their internal structure and collapse need to be characterised to understand the initial conditions for the formation of high-mass stars, stellar systems, and the origin of associations and clusters. We explore the onset of star formation in the 75 M_sun SMM1 clump in the region ISOSS J18364-0221 using infrared and (sub-)millimetre observations including interferometry. This contracting clump has fragmented into two compact cores, SMM1 North and South, of 0.05 pc radius, having masses of 15 and 10 M_sun, and luminosities of 20 and 180 L_sun. SMM1 South harbours a source traced at 24 and 70 μm, drives an energetic molecular outflow, and appears supersonically turbulent at the core centre. SMM1 North has no infrared counterparts and shows lower levels of turbulence, but also drives an outflow. Both outflows appear collimated, and parsec-scale near-infrared features probably trace the outflow-powering jets. We derived mass outflow rates of at least 4E-5 M_sun/yr and outflow timescales of less than 1E4 yr. Our HCN(1-0) modelling for SMM1 South yielded an infall velocity of 0.14 km/s and an estimated mass infall rate of 3E-5 M_sun/yr. Both cores may harbour seeds of intermediate- or high-mass stars. We compare the derived core properties with recent simulations of massive core collapse. They are consistent with the very early stages dominated by accretion luminosity. Comment: Accepted for publication in ApJ, 14 pages, 7 figures

    Exploring spatial-frequency-sequential relationships for motor imagery classification with recurrent neural network

    Get PDF
    Background: Conventional methods for motor imagery brain computer interfaces (MI-BCIs) suffer from a limited number of samples and simplified features, and thus produce poor performance with spatial-frequency features and shallow classifiers. Methods: Alternatively, this paper applies a deep recurrent neural network (RNN) with a sliding window cropping strategy (SWCS) to signal classification for MI-BCIs. The spatial-frequency features are first extracted by the filter bank common spatial pattern (FB-CSP) algorithm, and these features are cropped by the SWCS into time slices. By extracting spatial-frequency-sequential relationships, the cropped time slices are then fed into the RNN for classification. To overcome memory distractions, the commonly used gated recurrent unit (GRU) and long short-term memory (LSTM) unit are applied to the RNN architecture, and experimental results are used to determine which unit is more suitable for processing EEG signals. Results: Experimental results on common BCI benchmark datasets show that the spatial-frequency-sequential relationships outperform all other competing spatial-frequency methods. In particular, the proposed GRU-RNN architecture achieves the lowest misclassification rates on all BCI benchmark datasets. Conclusion: By introducing spatial-frequency-sequential relationships with cropped time slice samples, the proposed method provides a novel way to construct high-accuracy and robust MI-BCIs from limited trials of EEG signals.
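    A hedged sketch of the cropping-plus-RNN pipeline follows: the window length, stride, feature dimensionality and class count are invented for illustration, and the FB-CSP step is assumed to have already produced a (time, channels) feature array.

```python
import numpy as np
import torch
import torch.nn as nn

def sliding_window_crops(features, window, stride):
    """Crop a (time, channels) feature array into overlapping time slices,
    multiplying the number of training samples obtained per trial."""
    starts = range(0, len(features) - window + 1, stride)
    return np.stack([features[s:s + window] for s in starts])

# Illustrative shapes: 9 FB-CSP feature channels over 400 time steps.
trial = np.random.randn(400, 9).astype(np.float32)
crops = sliding_window_crops(trial, window=100, stride=25)  # (13, 100, 9)

gru = nn.GRU(input_size=9, hidden_size=64, batch_first=True)
classifier = nn.Linear(64, 4)            # e.g. 4 motor-imagery classes
_, h = gru(torch.from_numpy(crops))      # h: (1, n_crops, 64)
logits = classifier(h.squeeze(0))        # one prediction per cropped slice
```

    At test time the per-crop predictions for a trial would typically be aggregated, for instance by averaging, into a single decision.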

    Long Short-Term Memory Recurrent Neural Network for Stroke Prediction

    Full text link
    © 2018, Springer International Publishing AG, part of Springer Nature. Electronic Healthcare Records (EHRs) describe the details of a patient's physical and mental health, diagnoses, lab results, treatments, care plans and so forth. Currently, the International Classification of Diseases, 10th Revision (ICD-10) code is used to represent each patient record. The huge amount of information in these records provides insights into the diagnosis and prediction of various diseases, and various data mining techniques are used to analyse the data derived from them. The Recurrent Neural Network (RNN) is a powerful and widely used technique in machine learning and bioinformatics. This research investigates RNNs with Long Short-Term Memory (LSTM) hidden units; the empirical work evaluates the ability of LSTMs to recognize patterns in the multi-label classification of cerebrovascular symptoms, i.e. stroke. First, we integrated ICD-10 codes, together with other potential risk factors within EHRs, into the patterns and model used for prediction. Next, we modelled the effectiveness of LSTMs for the prediction of stroke based on healthcare records. The results establish strong baselines in terms of accuracy, recall, and F1 score.
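    The abstract does not give the network configuration; the sketch below shows one plausible shape for such a model in PyTorch, embedding a patient's sequence of ICD-10 codes and emitting one sigmoid per label. Vocabulary size, label count and layer widths are assumptions.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 2000  # distinct ICD-10 / risk-factor tokens (assumed)
NUM_LABELS = 5     # stroke-related outcome labels (assumed)

class StrokePredictor(nn.Module):
    """Embeds a patient's ICD-10 code sequence and predicts multiple
    stroke-related labels, one sigmoid probability per label."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, 64)
        self.lstm = nn.LSTM(64, 128, batch_first=True)
        self.out = nn.Linear(128, NUM_LABELS)

    def forward(self, codes):                   # codes: (batch, seq_len) ids
        _, (h, _) = self.lstm(self.embed(codes))
        return torch.sigmoid(self.out(h[-1]))   # multi-label probabilities

model = StrokePredictor()
probs = model(torch.randint(0, VOCAB_SIZE, (4, 30)))  # 4 patients, 30 codes
```

    Training such a multi-label model would use a binary cross-entropy loss per label rather than a single softmax over all labels.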

    Long Short-Term Memory Learns Context Free and Context Sensitive Languages

    No full text
    Previous work on learning regular languages from exemplary training sequences showed that Long Short-Term Memory (LSTM) outperforms traditional recurrent neural networks (RNNs). Here we demonstrate LSTM's superior performance on context-free language (CFL) benchmarks, and show that it works even better than previous hardwired or highly specialized architectures. To the best of our knowledge, LSTM variants are also the first RNNs to learn a context-sensitive language (CSL), namely a^n b^n c^n.
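    For concreteness, here is a tiny generator for that CSL under the usual next-symbol-prediction setup (the training-regime comments are assumptions, not taken from the abstract):

```python
import random

def anbncn(n):
    """One string of the context-sensitive language a^n b^n c^n."""
    return "a" * n + "b" * n + "c" * n

# Next-symbol prediction: after reading a prefix, the network must predict
# which symbols may legally follow. Getting the b->c and c->end transitions
# right requires counting the a's, a quantity that gated LSTM memory can
# carry but plain RNNs find hard to retain over long strings.
train_set = [anbncn(random.randint(1, 10)) for _ in range(100)]
print(train_set[0])
```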

    Human activity recognition using recurrent neural networks

    Get PDF
    Human activity recognition using smart home sensors is one of the bases of ubiquitous computing in smart environments and a topic undergoing intense research in the field of ambient assisted living. The increasingly large number of available data sets calls for machine learning methods. In this paper, we introduce a deep learning model that learns to classify human activities without using any prior knowledge. For this purpose, a Long Short-Term Memory (LSTM) Recurrent Neural Network was applied to three real-world smart home datasets. The results of these experiments show that the proposed approach outperforms the existing ones in terms of accuracy and performance.
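    A hedged sketch of such a model: raw binary sensor events over a time window go straight into an LSTM, with no hand-crafted features. Sensor and activity counts are placeholders, not values from the paper.

```python
import torch
import torch.nn as nn

NUM_SENSORS = 30     # binary smart-home sensors (assumed)
NUM_ACTIVITIES = 7   # activity classes, e.g. sleeping, cooking (assumed)

class ActivityLSTM(nn.Module):
    """Classifies a window of raw sensor events into an activity class."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(NUM_SENSORS, 128, batch_first=True)
        self.out = nn.Linear(128, NUM_ACTIVITIES)

    def forward(self, events):     # events: (batch, time, NUM_SENSORS)
        _, (h, _) = self.lstm(events)
        return self.out(h[-1])     # logits over activity classes

model = ActivityLSTM()
logits = model(torch.rand(8, 50, NUM_SENSORS))  # 8 windows of 50 events
```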